Data sets

Throughout the course we will use the two data sets described below.
The data sets can be downloaded from the Materials page.

Pulse

Students in an introductory statistics class (MS212 taught by Professor John Eccleston and Dr Richard Wilson at The University of Queensland) participated in a simple experiment. The students measured their own pulse rate. They were then asked to flip a coin. If the coin came up heads, they were to run in place for one minute. Otherwise they sat without movement for one minute. Then everyone measured their pulse again. The pulse rates and other physiological and lifestyle data are given in the data table.

Variable   Explanation
name       Name of a participant
height     Height (cm)
weight     Weight (kg)
age        Age (years)
gender     Sex (male/female)
smokes     Regular smoker? (yes/no)
alcohol    Regular drinker? (yes/no)
exercise   Frequency of exercise (high/moderate/low)
ran        Whether the student ran or sat between the first and second pulse measurements (ran/sat)
pulse1     First pulse measurement (rate per minute)
pulse2     Second pulse measurement (rate per minute)
year       Year of the class (1993 - 1998)

The pulse data set is available in the data folder as a comma-delimited text file, pulse.csv.

# read_csv() comes from the readr package (part of the tidyverse)
library(readr)

# Note: we keep the 'pulse.csv' file in the 'data' directory of the project
pulse <- read_csv("data/pulse.csv", show_col_types = FALSE)
pulse
# A tibble: 110 × 13
   id     name  height weight   age gender smokes alcohol exerc…¹ ran   pulse1 pulse2
   <chr>  <chr>  <dbl>  <dbl> <dbl> <chr>  <chr>  <chr>   <chr>   <chr>  <dbl>  <dbl>
 1 1993_A Bonn…    173     57    18 female no     yes     modera… sat       86     88
 2 1993_B Mela…    179     58    19 female no     yes     modera… ran       82    150
 3 1993_C Cons…    167     62    18 female no     yes     high    ran       96    176
 4 1993_D Trav…    195     84    18 male   no     yes     high    sat       71     73
 5 1993_E Lauri    173     64    18 female no     yes     low     sat       90     88
 6 1993_F Geor…    184     74    22 male   no     yes     low     ran       78    141
 7 1993_G Cher…    162     57    20 female no     yes     modera… sat       68     72
 8 1993_H Fran…    169     55    18 female no     yes     modera… sat       71     77
 9 1993_I Sonja    164     56    19 female no     yes     high    sat       68     68
10 1993_J Troy     168     60    23 male   no     yes     modera… ran       88    150
# … with 100 more rows, 1 more variable: year <dbl>, and abbreviated variable name
#   ¹​exercise
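
As a rough illustration of how the experiment described above might be summarised, the sketch below compares the change in pulse between students who ran and students who sat. To stay self-contained it does not read pulse.csv; it retypes only the ten rows printed above, so the numbers describe those ten students and not the full class. The names pulse_head and pulse_change are made up for this example, and the dplyr package is assumed to be installed.

```r
library(dplyr)

# The ten rows shown in the printout above, retyped by hand.
# Only the columns needed for this sketch are included.
pulse_head <- tibble(
  ran    = c("sat", "ran", "ran", "sat", "sat", "ran", "sat", "sat", "sat", "ran"),
  pulse1 = c(86, 82, 96, 71, 90, 78, 68, 71, 68, 88),
  pulse2 = c(88, 150, 176, 73, 88, 141, 72, 77, 68, 150)
)

# Change in pulse per student, then the group size and mean change
# for each level of 'ran'.
pulse_change <- pulse_head %>%
  mutate(change = pulse2 - pulse1) %>%
  group_by(ran) %>%
  summarise(n = n(), mean_change = mean(change))

pulse_change
```

Applying the same pipeline to the full pulse tibble should only require replacing pulse_head with pulse, provided the columns contain no missing values; otherwise mean() would need na.rm = TRUE.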

Survey

This data frame contains the responses of 233 Statistics I students at the University of Adelaide to a number of questions.
It is a slightly modified version of the survey data from the MASS package.

Variable   Explanation
name       Name of a participant
gender     Sex (male/female)
span1      Span (distance from tip of thumb to tip of little finger of spread hand) of writing hand (cm)
span2      Span of non-writing hand (cm)
hand       Writing hand of student (left/right)
fold       Fold your arms! Which is on top? (right/left/neither)
pulse      Pulse measurement (rate per minute)
clap       Clap your hands! Which is on top? (right/left/neither)
exercise   Frequency of exercise (freq/some/none)
smokes     How much the student smokes (heavy/regul/occas/never)
height     Height (cm)
m.i        Whether the student expressed height in imperial (feet/inches) or metric (centimetres/metres) units (metric/imperial)
age        Age of the student (years)

The survey data set is available in the data folder as a comma-delimited text file, survey.csv.

# Note: we keep the 'survey.csv' file in the 'data' directory of the project
survey <- read_csv("data/survey.csv", show_col_types = FALSE)
survey
# A tibble: 233 × 13
   name  gender span1 span2 hand  fold  pulse clap  exerc…¹ smokes height m.i     age
   <chr> <chr>  <dbl> <dbl> <chr> <chr> <dbl> <chr> <chr>   <chr>   <dbl> <chr> <dbl>
 1 Alys… female  18.5  18   right right    92 left  some    never    173  metr…  18.2
 2 Todd  male    19.5  20.5 left  right   104 left  none    regul    178. impe…  17.6
 3 Gera… male    18    13.3 right left     87 neit… none    occas     NA  <NA>   16.9
 4 Robe… male    18.8  18.9 right right    NA neit… none    never    160  metr…  20.3
 5 Dust… male    20    20   right neit…    35 right some    never    165  metr…  23.7
 6 Abby  female  18    17.7 right left     64 right some    never    173. impe…  21  
 7 Andre male    17.7  17.7 right left     83 right freq    never    183. impe…  18.8
 8 Mich… female  17    17.3 right right    74 right freq    never    157  metr…  35.8
 9 Edwa… male    20    19.5 right right    72 right some    never    175  metr…  19  
10 Carl  male    18.5  18.5 right right    90 right some    never    167  metr…  22.3
# … with 223 more rows, and abbreviated variable name ¹​exercise
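
The printout above contains missing values (NA), for example in pulse and height. The sketch below shows one way such values might be handled when summarising a column. It retypes only the pulse column of the ten rows shown above and does not read survey.csv; the names survey_head and pulse_summary are made up for this example, and the dplyr package is assumed to be installed.

```r
library(dplyr)

# The 'pulse' column of the ten rows shown in the printout above,
# retyped by hand (row 4 is missing).
survey_head <- tibble(
  pulse = c(92, 104, 87, NA, 35, 64, 83, 74, 72, 90)
)

pulse_summary <- survey_head %>%
  summarise(
    n_rows     = n(),                       # number of rows
    n_missing  = sum(is.na(pulse)),         # how many pulse values are NA
    mean_naive = mean(pulse),               # NA, because one value is missing
    mean_pulse = mean(pulse, na.rm = TRUE)  # mean of the observed values only
  )

pulse_summary
```

Dropping missing values with na.rm = TRUE is only one option; whether it is appropriate depends on why the values are missing.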


Copyright © 2023 Biomedical Data Sciences (BDS) | LUMC